Voice expression conversion with factorised HMM-TTS models

Authors

  • Javier Latorre
  • Vincent Wan
  • Kayoko Yanagisawa
Abstract

This paper proposes a method to modify the expression or emotion in a sample of speech without altering the speaker’s identity. The method exploits a statistical speech model that factorises speaker identity from expression using linear transforms. First, the set of transforms that best fits the speaker and expression of the input speech sample is learned. These transforms are then combined with the expression transforms of the desired expression, taken from another speaker. Because the combined expression transform is factorised and contains information about expression only, it can be applied to the original speech sample to change its expression to the desired one without altering the identity of the speaker. Notably, the method can be applied to any voice without the need for a parallel training corpus.
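The idea of combining factorised transforms can be illustrated with a small toy sketch. It is not the authors' model: the 2-D feature vectors, the affine form of the transforms, the order in which speaker and expression transforms are composed, and all numeric values and names (`speaker_a`, `source_expression`, `target_expression`) are assumptions made purely for illustration.

```python
# Toy sketch: 2-D features and invertible affine transforms are assumptions
# made for illustration; they are not the model described in the paper.

def apply_affine(transform, x):
    """Apply an affine transform (A, b) to a 2-D vector x, i.e. A x + b."""
    (a11, a12), (a21, a22) = transform[0]
    b1, b2 = transform[1]
    return (a11 * x[0] + a12 * x[1] + b1, a21 * x[0] + a22 * x[1] + b2)

def compose(outer, inner):
    """Return the single affine transform x -> outer(inner(x))."""
    (a11, a12), (a21, a22) = outer[0]
    (c11, c12), (c21, c22) = inner[0]
    matrix = (
        (a11 * c11 + a12 * c21, a11 * c12 + a12 * c22),
        (a21 * c11 + a22 * c21, a21 * c12 + a22 * c22),
    )
    offset = apply_affine(outer, inner[1])  # A_outer b_inner + b_outer
    return (matrix, offset)

def invert(transform):
    """Invert an affine transform; assumes its matrix is non-singular."""
    (a11, a12), (a21, a22) = transform[0]
    det = a11 * a22 - a12 * a21
    inverse = ((a22 / det, -a12 / det), (-a21 / det, a11 / det))
    shifted = apply_affine((inverse, (0.0, 0.0)), transform[1])  # A^-1 b
    return (inverse, (-shifted[0], -shifted[1]))

# Hypothetical transforms (values chosen arbitrarily).
speaker_a = (((1.2, 0.1), (0.0, 0.9)), (0.5, -0.3))          # input speaker
source_expression = (((1.1, 0.0), (0.2, 1.0)), (0.1, 0.0))   # fitted to the input
target_expression = (((0.8, 0.3), (-0.1, 1.4)), (1.0, 0.5))  # from another speaker

# Assumed generative order: canonical -> speaker -> expression.
canonical = (2.0, -1.0)
observed = apply_affine(source_expression, apply_affine(speaker_a, canonical))

# Combined expression-only transform: undo the source expression,
# then apply the target expression. No speaker transform is involved.
conversion = compose(target_expression, invert(source_expression))
converted = apply_affine(conversion, observed)

# Reference: the same speaker rendered directly with the target expression.
expected = apply_affine(target_expression, apply_affine(speaker_a, canonical))
print(converted, expected)
```

In this toy setting the `conversion` transform is built from expression transforms only, so applying it to the observed features changes the expression component while the speaker component passes through unchanged.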


Similar resources

Advances in Spectral Parameterization for Statistical (HMM-Based) TTS

HMM-based parametric speech synthesis has recently become an alternative to the concatenative TTS approach, especially when low footprint and general speech domain are required. A majority of speech parameterization models used in state-of-the-art HMM TTS systems employ source-filter waveform synthesis schemes. Sinusoidal representation and waveform generation of speech is an alternative to the ...


A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models

This paper presents a novel statistical sample-based approach for Gaussian Mixture Model (GMM)-based Voice Conversion (VC). Although GMM-based VC has the promising flexibility of model adaptation, quality in converted speech is significantly worse than that of natural speech. This paper addresses the problem of inaccurate modeling, which is one of the main reasons causing the quality degradatio...


Performance Analysis of Text To Speech Synthesis System Using HMM And Prosody Features With Parsing For Tamil Language

This paper describes a Hidden Markov Model (HMM)-based text-to-speech (TTS) system and a prosody-based TTS system for producing natural-sounding synthetic speech in the Tamil language. The HMM-based system consists of two phases: training and synthesis. Tamil speech is first parameterized into spectral and excitation features using Glottal Inverse Filtering (GIF). An emotions present in the input text is...


MARY TTS unit selection and HMM-based voices

This paper describes the implementation of a unit selection English voice and an HMM-based Hindi voice for our participation in the Blizzard Challenge 2013. The two voices have been created using the MARY TTS voice building framework. We describe how audiobook data is used to create the English voice and how a quality control measure (statistical model cost) is used to control the selection of uni...


Synthetic Voice Forgery in the Forensic Context: a short tutorial

Technical voice forgery in the forensic area has led to several studies, mainly dealing with voice conversion. In the last decade, the latest developments in voice synthesis have reached satisfactory intelligibility and quality levels. Moreover, several web-based or standalone apps can be used for TTS. Nowadays, HMM-based synthetic voices can be built to fool biometric systems. Several authors ...




Publication date: 2014